Quantile Filtering and Learning
Authors
Abstract
Similar resources
Quantile Matrix Factorization for Collaborative Filtering
Matrix Factorization-based algorithms are among the state-of-the-art Collaborative Filtering methods. In many of these models, a least squares loss functional is implicitly or explicitly minimized, and thus the resulting estimates correspond to the conditional mean of the potential rating a user might give to an item. However, they do not provide any information on the uncertainty and the conf...
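The contrast drawn above, between squared loss yielding a conditional mean and a quantile loss yielding a conditional quantile, can be illustrated with the pinball (quantile) loss. This is a minimal sketch, not the paper's algorithm; the function name and the toy data are illustrative.

```python
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Quantile (pinball) loss: its expected value is minimized by the
    tau-quantile of y, whereas squared loss is minimized by the mean."""
    diff = y - y_hat
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# Minimizing over a constant prediction recovers the empirical quantile:
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
grid = np.linspace(0.0, 100.0, 10001)
losses = [pinball_loss(y, c, 0.5) for c in grid]
best = grid[int(np.argmin(losses))]
# best is close to the median (3.0), unlike the mean (22.0)
```

Replacing the squared residual with this loss inside a matrix-factorization objective is what turns mean-rating estimates into quantile-rating estimates.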
Quantile Reinforcement Learning
In reinforcement learning, the standard criterion for evaluating policies in a state is the expectation of the (discounted) sum of rewards. However, this criterion may not always be suitable; we consider an alternative criterion based on the notion of quantiles. In the case of episodic reinforcement learning problems, we propose an algorithm based on stochastic approximation with two timescales. We ev...
Quantile-based noise estimation for spectral subtraction and Wiener filtering
Elimination of additive noise from a speech signal is a fundamental problem in audio signal processing. In this paper we restrict our considerations to the case where only a single microphone recording of the noisy signal is available. The algorithms which we investigate proceed in two steps: First, the noise power spectrum is estimated. A method based on temporal quantiles in the power spectra...
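The two-step procedure described above — estimate the noise power spectrum via temporal quantiles, then subtract it — can be sketched as follows. This is a simplified illustration, not the paper's exact method; function names, the quantile choice, and the spectral floor are assumptions.

```python
import numpy as np

def quantile_noise_estimate(power_spec, q=0.5):
    """Estimate the noise power spectrum from a noisy spectrogram of shape
    (frames, frequency_bins). Because speech is absent in many frames, a
    temporal quantile per frequency bin approximates the noise floor."""
    return np.quantile(power_spec, q, axis=0)

def spectral_subtraction(power_spec, noise_est, floor=1e-3):
    """Subtract the estimated noise power per bin, clipping to a small
    spectral floor to avoid negative power values."""
    return np.maximum(power_spec - noise_est[None, :], floor * power_spec)

# Toy spectrogram: constant noise with a speech burst in the last 10 frames.
power = np.ones((100, 4))
power[90:] += 10.0
noise = quantile_noise_estimate(power)   # close to the true noise level 1.0
clean = spectral_subtraction(power, noise)
```

The quantile makes the noise estimate robust to frames containing speech, which would inflate a mean-based estimate.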
Second-order quantile bounds in online learning
These notes reflect the contents of the oral presentation of the paper Koolen and van Erven (2015) given in the journal club. After introducing the Hedge setting and providing some context, including a minimax regret bound, we discuss two kinds of adaptivity to “easy” data: second-order bounds and quantile bounds. We then describe the Squint algorithm proposed by Koolen and van Erven (2015) and...
Distributional Reinforcement Learning with Quantile Regression
In reinforcement learning an agent interacts with the environment by taking actions and observing the next state and reward. When sampled probabilistically, these state transitions, rewards, and actions can all induce randomness in the observed long-term return. Traditionally, reinforcement learning algorithms average over this randomness to estimate the value function. In this paper, we build ...
Journal
Journal title: SSRN Electronic Journal
Year: 2009
ISSN: 1556-5068
DOI: 10.2139/ssrn.1509808